XML integration and toolkit for B2B applications
Abstract
This paper presents a Web-based data integration methodology and tool framework, called X-TIME, for the development of Business-to-Business (B2B) design environments and applications. X-TIME provides a data model translator toolkit based on an extensible metamodel and XML. It allows the creation of adaptable, semantics-oriented metamodels to facilitate the design of wrappers or reconciliators (mediators) by taking into account several characteristics of interoperable information systems, such as extensibility and composability. X-TIME defines a set of metatypes for representing meta-level semantic descriptors of data models found on the Web. The metatypes are organized in a generalization hierarchy to capture semantic similarities among modeling concepts of interoperable systems. We show how to use the X-TIME methodology to build cooperative environments for B2B platforms involving the integration of Web data and services.

B2B applications based on the interoperability of business information systems are increasingly available on the Internet. The Gartner Group (InfoWorld, 2000) estimates that worldwide B2B (B2B, 2001) revenue will reach $7.29 trillion by 2004. These emerging systems involve the exchange of both data and services among business information systems. They have created many challenges, including the need for novel interoperability techniques and architectures. Emerging B2B integration approaches must answer several questions, including (1) how to process and share data in various business formats, and (2) how to integrate various business functionalities into Web services.

Interoperability is generally hampered by heterogeneity issues. Platform (hardware, software, communication) heterogeneity is resolved by communication standards and protocols such as CORBA, IP and HTTP. Syntactic heterogeneity requires common or pivot metamodels to represent the data of the participating systems.
Finally, semantic heterogeneity, which is difficult to tackle, requires semantic models and languages that can capture the meaning of the representation concepts of different information systems.

In addition to traditional data integration concerns, Web-service integration must allow the exchange of business services and processes by standardizing (1) low-level communication systems and protocols (SOAP: Simple Object Access Protocol), (2) data presentation formats and definition languages (WSDL: Web Services Description Language), and (3) access to and classification of Web services (UDDI: Universal Description, Discovery and Integration). WSDL is a Web-service definition language aimed at resolving structural data heterogeneity, while UDDI (UDDI, 2002) provides a library of Web services and their specifications.

XML is increasingly used in the development of Web services and B2B integration and is emerging as a de facto standard for data exchange in networked environments (Abiteboul S., Buneman P., Suciu D., 2000; XML, 2000). XML (eXtensible Markup Language) is an open, text-based language that provides a structural description of information and associates semantics with data (Pardi W. J., 1999). XML is more than a tool or language for separating content from presentation: it is a meta-language from which more than 300 languages (ZapThink, 2001) have been developed, including Astronomical Dataset Markup Language (ADML, 2001), Advertising XML (adXML, 2001), Biopolymer Markup Language (BIOML, 2001), and Genome Annotation Markup Elements (GAME, 2001). The increased availability of XML in various domains makes it a good choice as an integration pivot metamodel for the design and translation of interoperating system schemas. As a metamodel, XML allows users to define schemas in the form of XML DTDs (Document Type Definitions) or grammars, which provide uniform syntactic elements for representing the conceptual characteristics of information systems.
Using XML reduces the complexity of reconciling structural heterogeneity among systems. For example, in the pre-integration data model translation step, the number of required translation tools can be reduced from O(N²) to O(N): instead of one translator for each pair of data models, each of the N models needs only a translator to and from the XML pivot.

Several major database system providers have added XML capabilities to their products. However, the XML-enabled DBMS (Database Management System) market is still emerging, representing less than 1% of the total DBMS market ($77 million; IDC, 2001). The leaders of this emerging market are Software AG (40.5% market share), followed by eXcelon Corp. (23.3%), Computer Associates International Inc. (19.4%), and Poet Software Corp. (1.3%). Three main approaches have been used to provide XML capabilities in databases: non-XML-native information systems use traditional data models to represent XML documents, XML-native systems are specifically designed for the manipulation of XML documents, and XML-based legacy systems provide XML layers atop traditional information systems.

XML-based data integration plays an important role in B2B application design. This importance stems from several facts. First, as a standard, XML facilitates the identification of correspondences and conflicts among different components of cooperating applications. For instance, Louise Lane et al. point out that e-commerce companies sell similar products and yet represent these products with different XML schemas and ontologies. When several such companies merge and restructure their applications, the integration of their respective information systems and applications into uniform and homogeneous systems requires novel XML-related integration tools and methodologies. Second, e-commerce applications often involve comparison shopping, in which similar data are searched from different sources and presented to end users, requiring ontologies and semantics-based tools to identify similar concepts and elements from different schemas or information sources.
Finally, the coordination of information systems that are used to process client orders and support manufacturing systems creates the need for wide acceptance of relatively few XML-like standards and related methodologies for merging heterogeneous information sources.

Background and contribution

In the following section, several traditional interoperability approaches are reviewed, emphasizing the role of semantic resolution and translation tools. Next, interoperability in Web-oriented environments is discussed, with an emphasis on the important role played by XML in the design of tools for interoperable architectures. The definition of the interoperable architecture is illustrated in B2B architecture development.

In the last ten years or so, several interoperability approaches have been proposed in the literature, ranging from the earlier work on database integration (Batini C., Lenzerni M., 1983) to recent ontology and semantic Web modeling approaches. They can be classified as follows, based on the concept or tool used to represent data and reconcile (both structural and semantic) discrepancies among the participants:

1. The database translation approach is a point-to-point solution that uses direct data mapping to resolve data heterogeneity between pairs of databases (Andersson M., 1994; Blaha M, Premerlani W., Shen H., 1994; Cluet S., Delobel C., Siméon J., Smaga K., 1998; Yan L.L., Ling T.W., 1992). This approach is appropriate when the number of participants in the interoperability environment is small, because the number of data translators grows with the square of the number of participating information systems.
2. In the standardization approach, the components of the interoperability environment use the same (or standard) data model to represent data and to communicate.
The standard model can be a comprehensive metamodel capable of integrating the requirements of the models of the different components (Atzeni P., Torlone R, 1997; Barsalou T., Gangopadhyay D., 1992; Jeusfeld M. A., Johnen U. A., 1994). Using a standard metamodel reduces the number of data translators (this number grows linearly with the number of components). However, the construction of a comprehensive metamodel is a difficult task.
3. The federation approach consists of an integrated collection of heterogeneous databases in which federation users access and manipulate data transparently, without knowledge of data location (Sheth A. P., Larson J. A., 1990). A federation contains a federated schema that integrates the data exported by the federation participants. There are two types of federations. Tightly coupled federations use a global federated schema, constructed by the federation administrator, to combine the schemas of all participants, while loosely coupled federations use non-global federated schemas, created by users or local database administrators, to combine the relevant schemas.
4. The language-based multi-database approach consists of a loosely connected collection of databases in which a common query language is used to access the contents of the participating databases (Keim D.A., Kriegel H.P., Miethsam A., 1994; Lakshmanan L.V.S., Sadri F., Subramian I. N., 1993). In this approach, in contrast to distributed and federated systems, the burden of creating the federated schema is placed on the users, who must discover and understand the semantics of other information systems.
5. The mediation approach is based on two main components. The first component is the mediator. It is used to create and support an integrated view of data from multiple sources. The mediator provides data discovery support and various query processing services. The second component is the wrapper.
It is used to map the local databases into a common federation data model. The wrapper component provides the basic data access functions (Garcia-Molina H., Hammer J., Ireland K., Papakonstantinou Y., Ullman J., Widow J., 1995).
6. The ontology-based approach uses an ontology to provide an explicit conceptualization of the common domain of a collection of information systems (Benslimane D., Leclercq E., Savonnet M, Terrasse M.N, Yétongnon K., 2000). A conceptualization is an abstract description of the concepts of a domain and the relationships among them. An ontology defines a common vocabulary for users of different systems. The basic idea is to provide common data semantics that are understood and accepted by all participants in the federation. Defining an ontology for a domain is a difficult task that often requires merging overlapping ontologies.

The focus of this paper is on design support for Web-oriented interoperable systems. New interoperability approaches are required to take into account the characteristics of Web-based information systems in the design of interoperable systems. The development of data and knowledge representation formats for exchanging information and services among business applications is one important issue that must be addressed when a common (or pivot) model must be specified for pre-integration resolution of structural or syntactic differences among participating information systems. As we stated above, XML is emerging as a de facto universal standard for data description and exchange on the Web. A universal meta-language must provide syntactic elements for describing structured documents. Semantic information can be associated with the description elements to customize XML to model different document types.

XML-based integration or interoperability can be achieved in two phases.
The first phase is devoted to the creation of XML-based information systems that are represented in some form of XML model, while the second phase concerns the reconciliation of non-structural differences among the XML-based information systems.

Three major approaches can be used to achieve XML-based information systems. XML-native systems are specifically designed for modeling and manipulating XML documents. They provide, in addition to the basic functionalities of traditional databases, specific APIs that allow XML databases and applications to access traditional relational databases using JDBC (Java DataBase Connectivity) or ODBC (Open DataBase Connectivity). The XML-native DBMS is an emerging technology whose capabilities have not yet been fully tested for feasibility and efficiency.

In the non-XML-native approach, traditional database systems are used to model and manipulate XML document structures. The complexity and accuracy of this modeling depend on the similarity between the data model and the hierarchical structure of XML documents. For example, because of its flat structure, the relational table concept is not well suited to representing the semi-structured format and dynamic properties of XML documents. Despite this inadequacy, many relational systems provide some XML capabilities: IBM DB2 Universal Database v7.1 XML Extender DBMS provides DTD functions for storing XML documents; Microsoft SQL Server 2000 uses BLOBs (Binary Large Objects) to store XML documents; other relational DBMS such as Oracle9i XML Developer Kit and DB2 UDBS provide specific XML data types. Object-oriented capabilities can also be incorporated in a DBMS for efficient manipulation of XML documents.

The third approach comprises XML-based legacy systems that provide translation layers on top of traditional databases or legacy information systems to wrap them and convert them to XML formats.
The XML-based wrapper supports sharing legacy system schemas, specifying queries, and reformatting results.

Using XML as a common data representation format addresses the syntactic heterogeneity issues of information system integration. XML DTDs or schemas, represented by XML grammars, can be created for participant systems to facilitate data exchange among them. However, there is also a need for XML-based semantic solutions that incorporate semantics in the terms and concepts used in the XML grammar descriptions of information systems. In this paper, we propose a methodology that aims to integrate XML description elements (or concepts) into a semantic generalization tree. The generalization tree represents associations or links between semantically similar concepts. Several issues must be addressed when a generalization tree is constructed over a set of XML-based concepts: How should concepts be organized and placed in the tree? How can the relations among concepts of the generalization tree be used to map XML documents from one XML grammar to another?

To address the above issues, we present a methodology called X-TIME that can be used to support the integration of information systems. It is based on an extensible XML-oriented metamodel and provides tools for data model translation and the design of wrappers or semantic mediators. X-TIME is a semantics-oriented metamodel methodology aimed at achieving interoperable information system characteristics such as extensibility and composability. X-TIME is based on a set of metatypes that can be used to specify meta-level semantic descriptors found in (1) the major database models (e.g. flat structure relations, entities, objects, classes, associations) and (2) the emerging Web-oriented models (e.g. XML models, semi-structured data models). The metatypes are organized in a generalization hierarchy to capture their semantic similarities and correlate constituent interoperable data models.
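To make the idea of a generalization hierarchy more concrete, the following minimal sketch models metatypes as nodes of a tree and computes the chain of directly linked metatypes between two of them, passing through their nearest common generalization. The metatype names (`Meta`, `ComplexObject`, `Relation`, `XmlElement`, `Link`, `Association`) and the `translation_path` helper are illustrative assumptions, not names taken from X-TIME.

```python
from typing import List, Optional


class MetaType:
    """A node in a hypothetical generalization hierarchy of metatypes."""

    def __init__(self, name: str, parent: Optional["MetaType"] = None):
        self.name = name
        self.parent = parent

    def specialize(self, name: str) -> "MetaType":
        """Create a more specific metatype directly below this one."""
        return MetaType(name, parent=self)

    def lineage(self) -> List["MetaType"]:
        """Return this metatype followed by its generalizations up to the root."""
        chain, node = [], self
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain


def translation_path(source: MetaType, target: MetaType) -> List[str]:
    """Names of the directly linked metatypes leading from source to target.

    The path climbs from the source to the nearest common generalization,
    then descends to the target.
    """
    up = source.lineage()
    down = target.lineage()
    position_in_down = {id(node): i for i, node in enumerate(down)}
    for i, node in enumerate(up):
        if id(node) in position_in_down:
            j = position_in_down[id(node)]
            path = up[: i + 1] + list(reversed(down[:j]))
            return [n.name for n in path]
    raise ValueError("metatypes do not belong to the same hierarchy")


# Hypothetical hierarchy: names are assumptions made for this sketch.
root = MetaType("Meta")
complex_object = root.specialize("ComplexObject")
relation = complex_object.specialize("Relation")       # relational model concept
xml_element = complex_object.specialize("XmlElement")  # XML model concept
link = root.specialize("Link")
association = link.specialize("Association")

print(translation_path(relation, xml_element))
# ['Relation', 'ComplexObject', 'XmlElement']
```

In this reading, two concepts that share a close common generalization need only a short chain of mappings, while semantically distant concepts are connected through more general metatypes.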
An example of data model translation in a B2B application is presented to illustrate the X-TIME methodology.

The remainder of the paper is organized as follows. The background and contribution of the paper have been presented in this section. Section 3 presents an overview of the design of a B2B environment based on the X-TIME methodology. The next four sections are devoted to a presentation of the tools that comprise the X-TIME methodology. The tools corresponding to the different layers of B2B platform development are specified. Section 9 concludes the paper and presents ongoing work.

Overview

To construct a B2B platform from standard information systems, specific tools and a standard data representation format are used to allow the systems to exchange and share information and services. This overview presents the main tools of the X-TIME methodology. We show how the tools can be used to create integrated XML metamodel components and standard data definition formats for developing a B2B data exchange platform.

Figure 1 depicts an example of B2B platform development using the X-TIME methodology. It consists of three layers: the specification layer, the grammar integration layer, and the translation layer. The specification layer is devoted to the description of local schema concepts using the X-Editor tool. It allows local administrators to graphically describe local data model concepts and map them into Description Logic and Backus-Naur Form (BNF) descriptions. The Description Logic represents the semantics of the concepts and the connections between them. The BNF defines the structure and constraints of the concepts. The grammar integration layer is used to build the metamodel. It takes as input a set of Description Logic and BNF files of local systems and produces as output a metamodel in which the local concepts are mapped to metatypes and classified according to semantic similarities.
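One simple way to picture the classification performed by the grammar integration layer is sketched below. Each concept is reduced to a set of feature names, and a metatype is taken to cover a local concept when all of the metatype's features appear in the concept. This set-inclusion test is only a stand-in for the Description Logic comparison, and the feature names, metatype names, and the `most_specific_metatypes` helper are assumptions made for the illustration.

```python
from typing import Dict, FrozenSet, List

# Hypothetical metatypes, each described by the features it requires.
METATYPES: Dict[str, FrozenSet[str]] = {
    "Meta": frozenset(),
    "ComplexObject": frozenset({"has_attributes"}),
    "Relation": frozenset({"has_attributes", "has_key", "flat"}),
    "XmlElement": frozenset({"has_attributes", "nested", "ordered"}),
}


def covers(metatype: FrozenSet[str], concept: FrozenSet[str]) -> bool:
    """True if every feature required by the metatype is present in the concept."""
    return metatype <= concept


def most_specific_metatypes(
    concept: FrozenSet[str], metatypes: Dict[str, FrozenSet[str]]
) -> List[str]:
    """Names of the covering metatypes that are not generalized by another
    covering metatype (there may be more than one)."""
    candidates = {
        name: feats for name, feats in metatypes.items() if covers(feats, concept)
    }
    return sorted(
        name
        for name, feats in candidates.items()
        # Drop a candidate if another candidate strictly extends its features.
        if not any(feats < other for other in candidates.values())
    )


# A local concept from a hypothetical relational system.
oracle_table = frozenset({"has_attributes", "has_key", "flat", "has_triggers"})
print(most_specific_metatypes(oracle_table, METATYPES))
# ['Relation']
```

A concept that carries the features of several unrelated metatypes is attached to all of them, which is why the result is a list rather than a single name.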
Figure 1. Overview of the layers used to build a B2B platform

The metamodel is represented as an XML Schema file. The result of the metatype classification is an inheritance lattice (Nicolle C., Jouanot F., Cullot N., 1998) created by a subsumption mechanism in which concept definitions are compared pairwise. The metamodel represents a cooperation model tailored to the specific semantics of the set of interoperating databases. The subsumption mechanism and the resulting metatype lattice guarantee the extensibility of the interoperable environment. To add new databases to the environment, the corresponding new metatypes are compared to the existing metatypes to determine the most semantically similar ones. The Strategic Hierarchy Builder (SHB) creates the metamodel of the final B2B environment.

The goal of the translation layer is to create data model translators to support information and service exchanges. This layer comprises two tools. The Transformation Rule Builder tool associates transformation or mapping rules with the metamodel, while the Translator Compiler creates one or more translators from the intermediate results of the Transformation Rule Builder tool. The generation of translators using these two tools is carried out in three steps: (1) convert the XML documents representing the local schemas to intermediate meta-schema graphs by substituting occurrences of metatypes for the occurrences of source modeling concepts (XML tags), (2) convert the intermediate meta-schema graphs to equivalent meta-schemas by using translation rules to carry out instance mapping between pairs of directly linked metatypes of the generalization hierarchy, and (3) use the Translator Compiler to map the resulting meta-schema to the target data model. The Translator Compiler groups all translations of metatypes from a source data model to metatypes in the target models (Translation Layer).
It can generate complete XSLT (XSL Transformations) files for the automatic conversion of an XML document from a source information system to target information systems (Figure 2).

Figure 2. Overview of the B2B platform using translators

The presented methodology can be applied to Web services to classify and integrate the XML concepts defined in the Web Services Description Language (WSDL, 2001). WSDL uses XML syntax to describe the methods and parameters of Web services: protocols, servers, ports, operations, input and output message formats, and returned exceptions. With WSDL, applications using SOAP (SOAP, 2002) can auto-configure Web-service exchanges, masking most of the low-level technical details. WSDL is the equivalent of the Interface Definition Language (IDL) used in CORBA (Common Object Request Broker Architecture) (CORBA, 2001).
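The three translation steps described earlier (substituting metatypes for source tags, mapping between linked metatypes, and mapping to the target model) can be pictured with the short sketch below. It uses plain Python dictionaries and the standard `xml.etree.ElementTree` module in place of generated XSLT files, so it is not the Translator Compiler's actual output. The tag names, metatype names, and mapping tables are all hypothetical.

```python
import copy
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical mapping tables; none of these names come from X-TIME itself.
SOURCE_TO_META: Dict[str, str] = {"table": "Relation", "column": "Attribute"}
META_TO_META: Dict[str, str] = {"Relation": "Class"}  # rule between linked metatypes
META_TO_TARGET: Dict[str, str] = {"Class": "class", "Attribute": "field"}


def rename_tags(root: ET.Element, mapping: Dict[str, str]) -> ET.Element:
    """Return a copy of the tree in which mapped tags are replaced.

    Tags without an entry in the mapping are left unchanged.
    """
    clone = copy.deepcopy(root)
    for element in clone.iter():
        element.tag = mapping.get(element.tag, element.tag)
    return clone


def translate(source_xml: str) -> str:
    """Apply the three steps in sequence and serialize the result."""
    source = ET.fromstring(source_xml)
    meta_graph = rename_tags(source, SOURCE_TO_META)      # step 1
    meta_schema = rename_tags(meta_graph, META_TO_META)   # step 2
    target = rename_tags(meta_schema, META_TO_TARGET)     # step 3
    return ET.tostring(target, encoding="unicode")


relational_schema = (
    '<table name="Customer">'
    '<column name="id"/><column name="email"/>'
    "</table>"
)
print(translate(relational_schema))
```

A real translator would also have to restructure content and convert instances, not only rename tags. The sketch only shows how the three mappings chain together.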